[herd] Global count of useful hardware update by maranget · Pull Request #1733 · herd/herdtools7

maranget · 2026-03-02T13:21:49Z

This draft PR is an evolution of PR #1730. The number of useful hardware updates is evaluated by a global scan of events at the very end of execution candidate construction. The scan counts the explicit write effects that target locations that may hold page table entries and whose written value may have its AF flag set to zero.

The technique sounds more general and should yield a better upper bound than the local technique of PR #1730. Some points remain to be considered:

+ Simple identification of the case of a failing CAS that nevertheless writes the value read from memory back into memory. In such a case, there is no need to count another hardware update. (Note that in PR #1730, "[herd] Attempt to limit the number of spurious AF updates", this situation is identified only by chance.)
+ Totally untested on ASL.

lib/misc.ml

The function `delay_kont` can be used to extract the value (type `'a`) returned by a monad (the second argument below).

```
val delay_kont : string -> 'a t -> ('a -> 'a t -> 'b t) -> 'b t
```

The continuation function `(fun v mv -> ... )` can then examine the returned value `v` and combine the monad `mv` independently, which proves very convenient on many occasions. The change performed by this commit permits affine usage of `mv` (_i.e._ zero or one effective occurrence), whereas linear usage (exactly one occurrence) was mandatory before.
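To make the shape of `delay_kont` concrete, here is a minimal sketch on a toy monad. Everything except the name and signature of `delay_kont` is an assumption made for illustration: the record type, `return`, `emit` and `>>=` are hypothetical and much simpler than herd's actual event monad. The sketch only shows a continuation that either uses `mv` once or drops it (affine usage).

```ocaml
(* Toy monad, for illustration only: a computation is a result
   together with the list of effects it emitted. *)
type 'a t = { result : 'a; effects : string list }

let return v = { result = v; effects = [] }

(* Hypothetical helper: emit one named effect and return [v]. *)
let emit e v = { result = v; effects = [e] }

(* Sequential composition: effects of [m] come before those of [f]. *)
let ( >>= ) m f =
  let m' = f m.result in
  { m' with effects = m.effects @ m'.effects }

(* Same signature as in the commit message; the body is an assumption.
   The continuation receives both the value and the monad. *)
let delay_kont (_name : string) (m : 'a t) (k : 'a -> 'a t -> 'b t) : 'b t =
  k m.result m

let read = emit "R[x]" 1

(* [mv] used exactly once: its effect is kept. *)
let used =
  delay_kont "used" read (fun v mv -> mv >>= fun _ -> emit "W[y]" (v + 1))

(* [mv] not used at all: allowed under affine usage. *)
let dropped = delay_kont "dropped" read (fun v _mv -> return (v * 2))
```

In this toy model, dropping `mv` simply discards its effects; what the real monad does in that case is not stated in the commit message.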

+ Efficient group function: sort, then group.
+ Suffix-based generators:
  - generate all suffixes,
  - cross product of suffixes.
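As an illustration of the first two items, the following is a minimal sketch of "sort, then group" and of suffix generation. The names `group` and `suffixes`, their signatures, and the choice to include the empty suffix are assumptions; they are not taken from the actual `Misc` module.

```ocaml
(* Hypothetical names; not the actual Misc API. *)

(* Group equal elements: sort first, then gather adjacent runs.
   The cost is dominated by the sort. *)
let group (cmp : 'a -> 'a -> int) (xs : 'a list) : 'a list list =
  (* [cur] is the current run (reversed), [acc] the finished runs (reversed). *)
  let rec gather cur acc = function
    | [] -> List.rev (List.rev cur :: acc)
    | y :: ys ->
      begin match cur with
      | x :: _ when cmp x y = 0 -> gather (y :: cur) acc ys
      | _ -> gather [y] (List.rev cur :: acc) ys
      end
  in
  match List.stable_sort cmp xs with
  | [] -> []
  | x :: rest -> gather [x] [] rest

(* All suffixes of a list, longest first, empty suffix included. *)
let rec suffixes = function
  | [] -> [[]]
  | _ :: tl as xs -> xs :: suffixes tl
```

Sorting first means equal elements end up adjacent, so a single linear pass is enough to form the groups.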

When TTHM=HA or TTHM=HD is active, hardware updates of the AF flag are performed. This includes the so-called "spurious" updates, which are performed independently of the test code. For efficiency, we limit the number of such spurious updates to what is necessary. We do so by a global scan of the execution candidates, counting the writes that may unset the AF flag in the final set of effects. Note that this scan also considers the initial writes. We perform one optimisation: when the value of a write effect is the same as the value read by the same instruction from the same location, there is no need to add a supplementary spurious update, since the (potential) update associated with the write that stored the value has already been counted and is sufficient.
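The counting rule described above can be sketched as follows. This is a toy model under stated assumptions, not herd's implementation: the types `pte_val` and `write`, their fields, and the function names are all hypothetical, and real events carry far more information.

```ocaml
(* Toy page table entry value: only the AF flag and an output address. *)
type pte_val = { af : bool; oa : string }

(* Toy write effect (hypothetical fields). *)
type write = {
  loc : string;
  is_pte : bool;               (* location may hold a page table entry *)
  value : pte_val;             (* value written *)
  read_back : pte_val option;  (* value read by the same instruction
                                  from the same location, if any *)
}

(* A write calls for one potential hardware update when it stores a value
   with AF = 0 into a PTE location, unless it merely writes back the value
   the same instruction read (e.g. a failing CAS). *)
let needs_update w =
  w.is_pte
  && not w.value.af
  && (match w.read_back with
      | Some v -> v <> w.value
      | None -> true)

(* Global scan: initial writes are included in the count. *)
let count_hw_updates (init_writes : write list) (writes : write list) : int =
  List.length (List.filter needs_update (init_writes @ writes))
```

The resulting count serves as an upper bound on the number of spurious updates worth generating for one execution candidate.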

maranget mentioned this pull request Mar 2, 2026

[herd] More efficient computation of atomic load X stores pairs #1735

Merged

maranget force-pushed the global-hwupdates branch from c9f7855 to bbcb779 Compare March 4, 2026 17:56

HadrienRenaud reviewed Mar 5, 2026

lib/misc.ml (resolved)

maranget force-pushed the global-hwupdates branch from bbcb779 to f760a93 Compare March 23, 2026 16:12

maranget marked this pull request as ready for review March 30, 2026 08:15

maranget mentioned this pull request Mar 30, 2026

[herd] Attempt to limit the number of spurious AF updates #1730

Closed

maranget added 5 commits March 30, 2026 11:23

[herd] Small simplification/optimisation

27d63da

[lib] Additions to the Misc module

6911b0d

[herd] Add tests

172203a

maranget force-pushed the global-hwupdates branch from f760a93 to 172203a Compare March 30, 2026 09:23

maranget wants to merge 5 commits into master from global-hwupdates
